Abstract: In this paper, we discuss a class of Baranchik-type shrinkage estimators of the vector parameter in a location model, with errors belonging to a sub-class of elliptically contoured distributions. We derive conditions under Schwartz space in which the underlying class of shrinkage estimators outperforms the sample mean. Sufficient conditions on the dominant class to outperform the usual James-Stein estimator are also established. It is shown that the dominance properties of the class of estimators are robust with respect to departures from normality.
1 Introduction

The assumption of normality restricts the range of possible applications, especially for flatter densities. The elliptically contoured distributions (ECDs) are the parametric forms of the spherically symmetric distributions, which are invariant under orthogonal transformations and have equal density on spheres if densities exist. The primary purpose of ECDs is to provide a list of heavier/lighter-tailed alternatives to the multivariate Gaussian models. Material on vector-variate distributional properties and inferential problems can be found in Das Gupta et al. (1972), Cambanis et al. (1981), Muirhead (1982), Anderson et al. (1986), Cellier et al. (1989), Anderson and Fang (1990), Fang and Zhang (1990), Fang et al. (1990) and Kibria and Haq (1999). Among others, the book of Gupta and Varga (1993) presents some significant results dealing with matrix-variate ECDs. Some of the well-known elliptical distributions are the Gaussian, Pearson Type II/VII, Student's t, logistic, Kotz type, Laplace, Bessel and power exponential multivariate distributions.

In this paper, we consider the location model in a more general setup involving dependent errors. Let $S(p)$ denote the set of all $p \times p$ positive definite matrices.



The precise set-up of the problem is as follows: Let $Y_i$ be a $p \times 1$ response vector with model

$$Y_i = \theta + e_i, \qquad 1 \le i \le N. \tag{1.1}$$

Here $\theta$ is a $p \times 1$ vector of location parameters and $e_i$ is a $p \times 1$ error vector such that

$$E(e_i) = 0, \qquad Cov(e_i, e_j) = \Sigma \in S(p), \qquad i, j = 1, \dots, N, \quad N > p. \tag{1.2}$$

It is assumed, in general, that $e = [e_1, \dots, e_N]$ has a joint elliptically contoured distribution. Typically, if it possesses a density, it is given by

$$f(e|\Sigma) \propto |\Sigma|^{-N/2}\, g\left[\mathrm{tr}\,\Sigma^{-1} e e'\right], \tag{1.3}$$



where $g(\cdot)$ is a non-negative function over $\mathbb{R}_+$ such that $f(\cdot|\cdot)$ is a density function with respect to (w.r.t.) a $\sigma$-finite measure $\mu$ on $\mathbb{R}^p$. In this case, the notation $e_i \sim \mathcal{E}_p(0, \Sigma, g)$ is used.

Due to Chu (1973), each component of the model proposed in (1.3) can be represented in the following form:

$$f(x) = \int_0^\infty W(t)\, \phi_{N_p(0, t^{-1}\Sigma)}(x)\, dt, \tag{1.4}$$

where $\phi_{N_p(0, t^{-1}\Sigma)}(\cdot)$ is the pdf of $N_p(0, t^{-1}\Sigma)$,

$$W(t) = (2\pi)^{p/2} |\Sigma|^{1/2} t^{-p/2} \mathcal{L}^{-1}[f(s)], \tag{1.5}$$

and $\mathcal{L}^{-1}[f(s)]$ denotes the inverse Laplace transform of $f(s)$ with $s = x'\Sigma^{-1}x/2$.

The inverse Laplace transform of $f(\cdot)$ exists provided that the following conditions are satisfied.

(i) $f(t)$ is differentiable when $t$ is sufficiently large.

(ii) $f(t) = o(t^{-m})$ as $t \to \infty$, $m > 1$.

Although it is rather difficult to derive the inverse Laplace transform of some functions, it can be handled for many density generators of elliptical densities. See Debnath and Bhatta (2007) for more details.

The mean of $e_i$ is the zero vector and the covariance matrix of $e_i$ is

$$Cov(e_i) = \int_0^\infty Cov(e_i|t)\, W(t)\, dt = \int_0^\infty W(t)\, Cov\{N_p(0, t^{-1}\Sigma)\}\, dt = \left(\int_0^\infty t^{-1} W(t)\, dt\right) \Sigma, \tag{1.6}$$

provided the above integral exists.
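As a numerical illustration of (1.6), consider the multivariate Student's t distribution with $\nu$ degrees of freedom, a standard scale mixture of normals whose mixing density $W(t)$ is the Gamma density with shape $\nu/2$ and rate $\nu/2$. The sketch below is only an illustration under that assumption: the function names, the grid and the value $\nu = 5$ are hypothetical choices, not part of the paper. It approximates $\int_0^\infty t^{-1}W(t)\,dt$ by the trapezoid rule and compares it with the known factor $\nu/(\nu-2)$.

```python
import math

def t_mixing_density(t, nu):
    """Gamma(shape=nu/2, rate=nu/2) density, used here as the mixing
    density W(t) of a multivariate Student's t (assumed example)."""
    if t <= 0.0:
        return 0.0
    a = nu / 2.0
    return math.exp(a * math.log(a) - math.lgamma(a)
                    + (a - 1.0) * math.log(t) - a * t)

def covariance_factor(nu, upper=60.0, steps=200_000):
    """Trapezoid approximation of the integral of W(t)/t over (0, upper].
    The integrand is taken as 0 at t = 0, which is valid for nu > 4."""
    h = upper / steps
    total = 0.0
    for k in range(1, steps + 1):
        t = k * h
        weight = 0.5 if k == steps else 1.0  # half weight at the right endpoint
        total += weight * t_mixing_density(t, nu) / t
    return total * h

nu = 5.0
approx = covariance_factor(nu)   # numerical value of the factor in (1.6)
exact = nu / (nu - 2.0)          # closed form for the Student's t case
print(approx, exact)
```

Under this assumption $Cov(e_i) = \frac{\nu}{\nu-2}\Sigma$, so the factor multiplying $\Sigma$ in (1.6) is finite only when $\nu > 2$.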



Another sub-class of ECDs which includes the above class may be generated by a signed measure $\mathcal{W}$ on the measurable space $(\mathbb{R}_+, \mathcal{B})$ such that the pdf $f(\cdot)$ satisfies the following:

$$\text{(i)}\quad f(x) = \int_0^\infty \phi_{N_p(0, t^{-1}\Sigma)}(x)\, \mathcal{W}(dt), \tag{1.7}$$

$$\text{(ii)}\quad \int_0^\infty t^{-1}\, \mathcal{W}^+(dt) < \infty,$$

$$\text{(iii)}\quad \int_0^\infty t^{-1}\, \mathcal{W}^-(dt) < \infty,$$

where $\mathcal{W}^+ - \mathcal{W}^-$ is the Jordan decomposition of $\mathcal{W}$ into positive and negative parts (see e.g. Srivastava and Bilodeau, 1989). Note that from (ii)-(iii) of (1.7),

$$\int_0^\infty t^{-1}\, \mathcal{W}(dt) < \infty \tag{1.8}$$

and thus $Cov(e_i)$ exists under the sub-class defined above.

Now, under the Bayesian framework, it is assumed that little is known a priori about the parameters, namely the elements of $\theta$ and the $p(p+1)/2$ distinct elements of $\Sigma \in S(p)$. We first suppose that the elements of $\theta$ and those of $\Sigma$ are approximately independent (see Box and Tiao, 1992, page 425), i.e.

$$\pi(\theta, \Sigma) = \pi(\theta)\pi(\Sigma). \tag{1.9}$$

Using the invariance theory due to Jeffreys (1961), we take

$$\pi(\theta) \propto \text{constant}, \qquad \pi(\Sigma) \propto |\Sigma|^{-\frac{p+1}{2}},$$

as the prior knowledge about the parameter space.

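The next step is to give the marginal posterior distribution of the location parameter given the responses.

Lemma 1.1. Assume in the location model (1.1), $e_i \sim \mathcal{E}_p(0, \Sigma, g)$, where $\Sigma \in S(p)$. Then, w.r.t. the prior distribution given by (1.9), the posterior distribution of $\theta$ is a multivariate Student's t distribution, denoted by $\theta|Y \sim t_p(\bar Y, S, N-p)$, with the following pdf

$$f(\theta|Y) = \frac{N^{p/2}\,\Gamma\left(\frac{N}{2}\right)}{\pi^{p/2}\,\Gamma\left(\frac{N-p}{2}\right)|S|^{1/2}} \left\{1 + N(\theta - \bar Y)'S^{-1}(\theta - \bar Y)\right\}^{-N/2},$$

where $Y = (Y_1, \dots, Y_N)$, and

$$\bar Y = \frac{1}{N}\sum_{i=1}^N Y_i, \qquad S = \sum_{i=1}^N (Y_i - \bar Y)(Y_i - \bar Y)'. \tag{1.10}$$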



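Proof: Using Proposition 1 of Ng (2002), one can directly obtain

$$f(\theta|Y) \propto \left|\sum_{i=1}^N (Y_i - \theta)(Y_i - \theta)'\right|^{-N/2}, \tag{1.11}$$

which is the same as when the errors are taken to be normally distributed (Zellner, 1971, p. 243). Using the equality

$$(Y_i - \theta)(Y_i - \theta)' = (Y_i - \bar Y)(Y_i - \bar Y)' + (\bar Y - \theta)(\bar Y - \theta)' + (Y_i - \bar Y)(\bar Y - \theta)' + (\bar Y - \theta)(Y_i - \bar Y)',$$

whose cross terms vanish on summing over $i$, we observe

$$\left|\sum_{i=1}^N (Y_i - \theta)(Y_i - \theta)'\right| = |S + NA|, \tag{1.12}$$

where $A = (\theta - \bar Y)(\theta - \bar Y)'$.

By Corollary A.3.1 of Anderson (2003) we obtain

$$|S + NA| = \begin{vmatrix} S & -\sqrt{N}(\theta - \bar Y) \\ \sqrt{N}(\theta - \bar Y)' & 1 \end{vmatrix} = \begin{vmatrix} 1 & \sqrt{N}(\theta - \bar Y)' \\ -\sqrt{N}(\theta - \bar Y) & S \end{vmatrix} = |S|\left\{1 + N(\theta - \bar Y)'S^{-1}(\theta - \bar Y)\right\}. \tag{1.13}$$

Therefore, by making use of equations (1.11)-(1.13) we obtain

$$f(\theta|Y) = c(N,p)\,|S|^{-N/2}\left\{1 + N(\theta - \bar Y)'S^{-1}(\theta - \bar Y)\right\}^{-N/2},$$

where

$$c(N,p)^{-1} = \int_{\theta \in \mathbb{R}^p} |S|^{-N/2}\left\{1 + N(\theta - \bar Y)'S^{-1}(\theta - \bar Y)\right\}^{-N/2} d\theta = \frac{\pi^{p/2}\,\Gamma\left(\frac{N-p}{2}\right)}{N^{p/2}\,\Gamma\left(\frac{N}{2}\right)|S|^{\frac{N-1}{2}}}.$$

This proves our claim.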
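The determinant step (1.12)-(1.13) can be checked numerically. The following sketch uses made-up data with $N = 5$ and $p = 3$ (the observations, the trial value of $\theta$ and the helper names `det3` and `solve3` are all hypothetical) and compares $|\sum_i (Y_i-\theta)(Y_i-\theta)'|$ with $|S|\{1 + N(\theta-\bar Y)'S^{-1}(\theta-\bar Y)\}$.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, v):
    """Solve m x = v for a 3x3 system via Cramer's rule."""
    det_m = det3(m)
    x = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]      # replace one column by the right-hand side
        x.append(det3(mc) / det_m)
    return x

# Hypothetical data: N = 5 observations in p = 3 dimensions, and a trial theta.
Y = [[1.0, 2.0, 0.5], [0.0, 1.5, 1.0], [2.0, 0.5, 1.5],
     [1.5, 1.0, 0.0], [0.5, 3.0, 2.0]]
theta = [0.3, -0.2, 1.1]
N, p = len(Y), 3

Ybar = [sum(y[j] for y in Y) / N for j in range(p)]
# S = sum_i (Y_i - Ybar)(Y_i - Ybar)'
S = [[sum((y[j] - Ybar[j]) * (y[k] - Ybar[k]) for y in Y) for k in range(p)]
     for j in range(p)]
# T = sum_i (Y_i - theta)(Y_i - theta)'
T = [[sum((y[j] - theta[j]) * (y[k] - theta[k]) for y in Y) for k in range(p)]
     for j in range(p)]

diff = [theta[j] - Ybar[j] for j in range(p)]
S_inv_diff = solve3(S, diff)
quad = sum(diff[j] * S_inv_diff[j] for j in range(p))  # (theta-Ybar)' S^{-1} (theta-Ybar)

lhs = det3(T)                      # left-hand side of (1.12)
rhs = det3(S) * (1.0 + N * quad)   # right-hand side of (1.13)
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding, which is the identity that turns (1.11) into the Student's t kernel.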

Throughout this paper, we shall also assume that the loss function is given by

$$L(\hat\theta; \theta) = N(\hat\theta - \theta)'\Sigma^{-1}(\hat\theta - \theta) \tag{1.14}$$

for any estimator $\hat\theta$ of $\theta$.

It is well known that the Bayes estimator of $\theta$ with respect to the loss (1.14) is the posterior mean (Proposition 2.5.1, Robert, 2001) given by

$$\hat\theta = \bar Y. \tag{1.15}$$

As can be seen from (1.15), the Bayes estimator reduces to the sample mean under the setup presented above. So there is no need to deal with the Bayesian aspects of $\theta$, and throughout the paper we work with the sample mean rather than the Bayes estimator.

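Then, from ECD properties (see Fang et al., 1990) we have

$$\hat\theta \sim \mathcal{E}_p(\theta, N^{-1}\Sigma, g). \tag{1.16}$$

From the classical viewpoint, we consider a general class of Stein-type shrinkage estimators based on the estimator $\hat\theta$, given by

$$\delta_r(\hat\theta) = \left(1 - \frac{r(\hat\theta' S^{-1}\hat\theta)}{\hat\theta' S^{-1}\hat\theta}\right)\hat\theta, \tag{1.17}$$

where $r : [0,\infty) \to [0,\infty)$ is an absolutely continuous function.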

Furthermore, $r \in \mathcal{S}(\mathbb{R}_+, \mu)$ (the Schwartz space or space of rapidly decreasing functions on $\mathbb{R}_+$ with the measure $\mu$), where

$$\mathcal{S}(\mathbb{R}_+, \mu) = \{r \in C^\infty(\mathbb{R}_+, \mu) : \|r\|_{\alpha,\beta} < \infty \ \ \forall \alpha, \beta\},$$

$\alpha$ and $\beta$ are indices, $C^\infty(\mathbb{R}_+, \mu)$ is the set of all smooth functions from $\mathbb{R}_+$ to $\mathbb{C}$ (the set of all complex numbers) and

$$\|r\|_{\alpha,\beta} = \|x^\alpha D^\beta r\|_\infty = \sup\{|x^\alpha D^\beta r(x)| : x \in \text{domain of } r\}.$$

Here $D^\beta$ stands for the $\beta$-th derivative of $r$. See Folland (1999) for more details.

The latter condition plays a key role in obtaining the main result. Note that for every function $r(\cdot)$ belonging to $\mathcal{S}(\mathbb{R}_+, \mu)$, we have

$$\int_0^\infty r'(x)\, d\mu(x) < \infty, \tag{1.18}$$

$$\int_0^\infty r^2(x)\, d\mu(x) < \infty. \tag{1.19}$$

Moreover, the Schwartz space is dense in the space of all functions satisfying conditions (1.18) and (1.19).

The objective of this study is to construct conditions on $r(\cdot)$ under which $\delta_r(\hat\theta)$ performs better than $\hat\theta$ in the sense of having smaller risk w.r.t. the loss function given by (1.14).

This study is motivated by the work of Srivastava and Bilodeau (1989). They considered a class of estimators similar to (1.17), with the function $r(\cdot)$ replaced by a constant, under classical decision theory. Although, as noted above, the Bayesian point of view with a vague prior does not offer substantial generality over the work of Srivastava and Bilodeau (1989), because of (1.16) the class specified in (1.17) contains their class as a special case.
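To make the form of (1.17) concrete, here is a minimal sketch of the estimator $\delta_r(\hat\theta) = (1 - r(F)/F)\,\hat\theta$ with $F = \hat\theta' S^{-1}\hat\theta$. The helper names `solve` and `shrinkage_estimate`, the input values, and the particular constant chosen for $r$ are all hypothetical illustrations; the constant case only mirrors the remark that Srivastava and Bilodeau (1989) used a constant in place of $r(\cdot)$.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def shrinkage_estimate(theta_hat, S, r):
    """Return ((1 - r(F)/F) * theta_hat, F) with F = theta_hat' S^{-1} theta_hat."""
    F = sum(t * s for t, s in zip(theta_hat, solve(S, theta_hat)))
    factor = 1.0 - r(F) / F
    return [factor * t for t in theta_hat], F

# Hypothetical inputs: p = 3, N = 10, identity scatter matrix S.
p, N = 3, 10

def r_const(F):
    """Constant r (illustrative value), independent of F."""
    return (p - 2) / (N * (N - p + 2))

theta_hat = [1.0, 2.0, 2.0]
S = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
delta, F = shrinkage_estimate(theta_hat, S, r_const)
print(F, delta)
```

Any absolutely continuous $r : [0,\infty) \to [0,\infty)$ can be passed in place of `r_const`; the estimator always shrinks $\hat\theta$ along its own direction by the factor $1 - r(F)/F$.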



2 Risk Derivations 



In this section, we give some lemmas to evaluate the risk function of $\delta_r(\hat\theta)$. Provided that all expectations exist, we have the following lemma.

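Lemma 2.1. If $x \sim N_p(\theta, \alpha\Sigma)$, $\alpha > 0$, $\Sigma \in S(p)$, is independent of $S \sim W_p(\beta\Sigma, n)$, $\beta > 0$, $n = N - 1$, then

$$E\left[\frac{x'\Sigma^{-1}(x - \theta)\, r(x'S^{-1}x)}{x'S^{-1}x}\right] = \beta\alpha(n-p+1)\left\{(p-2)E\left[\frac{r(x'S^{-1}x)}{x'\Sigma^{-1}x}\right] + 2E\left[r'(x'S^{-1}x)\right]\right\}$$

and

$$E\left[\frac{x'\Sigma^{-1}x\, r^2(x'S^{-1}x)}{(x'S^{-1}x)^2}\right] = \beta^2(n-p+1)(n-p+3)E\left[\frac{r^2(x'S^{-1}x)}{x'\Sigma^{-1}x}\right].$$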



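Proof: Let $y = \Sigma^{-1/2}x$, $\theta^* = \Sigma^{-1/2}\theta$, and $A = \beta^{-1}\Sigma^{-1/2}S\Sigma^{-1/2}$, where $\Sigma^{-1/2}$ is a symmetric square root of $\Sigma^{-1}$. From the independence of $x$ and $S$, $y \sim N_p(\theta^*, \alpha I_p)$ is independent of both $A \sim W_p(I_p, n)$ and $\frac{y'y}{y'A^{-1}y} \sim \chi^2_{n-p+1}$. Also note that $F = x'S^{-1}x = \beta^{-1}y'A^{-1}y$. Therefore, using Stein's (1981) identity we get

$$E\left[\frac{x'\Sigma^{-1}(x-\theta)\, r(x'S^{-1}x)}{x'S^{-1}x}\right] = E\left[\frac{y'(y-\theta^*)\, r(F)}{F}\right] = E\left[\frac{y'(y-\theta^*)\, r(F)}{y'y}\right] E\left[\frac{y'y}{F}\right]$$

$$= \beta(n-p+1)E\left[\frac{y'(y-\theta^*)\, r(F)}{y'y}\right] = \alpha\beta(n-p+1)\left\{(p-2)E\left[\frac{r(F)}{y'y}\right] + 2E[r'(F)]\right\}.$$

Similarly

$$E\left[\frac{x'\Sigma^{-1}x\, r^2(x'S^{-1}x)}{(x'S^{-1}x)^2}\right] = E\left[\frac{y'y\, r^2(F)}{F^2}\right] = \beta^2 E\left[\frac{r^2(F)}{y'y}\right] E\left[\left(\frac{y'y}{y'A^{-1}y}\right)^2\right] = \beta^2(n-p+1)(n-p+3)E\left[\frac{r^2(F)}{y'y}\right].$$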



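Lemma 2.2. The risk function of the estimator $\delta_r(\hat\theta)$ w.r.t. the loss function (1.14) is given by

$$R(\delta_r(\hat\theta); \theta) = R(\hat\theta; \theta) - 4(N-p)\int_0^\infty E\left[r'(\hat\theta'S^{-1}\hat\theta) \,\middle|\, t\right] t^{-2}\,\mathcal{W}(dt)$$

$$+ (N-p)\int_0^\infty E\left[\frac{r(\hat\theta'S^{-1}\hat\theta)}{t\,\hat\theta'(t^{-1}\Sigma)^{-1}\hat\theta} \times \left\{N(N-p+2)\, r(\hat\theta'S^{-1}\hat\theta) - 2(p-2)\right\} \,\middle|\, t\right] \mathcal{W}(dt).$$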
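Proof: Using the representation (1.4), we can write

$$R(\delta_r(\hat\theta); \theta) = N E\left[(\delta_r(\hat\theta) - \theta)'\Sigma^{-1}(\delta_r(\hat\theta) - \theta)\right]$$

$$= R(\hat\theta; \theta) - 2N\int_0^\infty E\left[\frac{\hat\theta'\Sigma^{-1}(\hat\theta - \theta)\, r(\hat\theta'S^{-1}\hat\theta)}{\hat\theta'S^{-1}\hat\theta} \,\middle|\, t\right]\mathcal{W}(dt) + N\int_0^\infty E\left[\frac{\hat\theta'\Sigma^{-1}\hat\theta\, r^2(\hat\theta'S^{-1}\hat\theta)}{(\hat\theta'S^{-1}\hat\theta)^2} \,\middle|\, t\right]\mathcal{W}(dt). \tag{2.1}$$

But from $e_i | t \sim N_p(0, t^{-1}\Sigma)$, it follows that $\hat\theta | t \sim N_p(\theta, t^{-1}N^{-1}\Sigma)$ is independent of $S | t \sim W_p(t^{-1}\Sigma, n)$. Consequently, by making use of Lemma 2.1 for $\alpha = (tN)^{-1}$ and $\beta = t^{-1}$ we get

$$E\left[\frac{\hat\theta'\Sigma^{-1}(\hat\theta - \theta)\, r(\hat\theta'S^{-1}\hat\theta)}{\hat\theta'S^{-1}\hat\theta} \,\middle|\, t\right] = \frac{N-p}{Nt^2}\left\{(p-2)E\left[\frac{r(\hat\theta'S^{-1}\hat\theta)}{\hat\theta'\Sigma^{-1}\hat\theta} \,\middle|\, t\right] + 2E\left[r'(\hat\theta'S^{-1}\hat\theta) \,\middle|\, t\right]\right\},$$

$$E\left[\frac{\hat\theta'\Sigma^{-1}\hat\theta\, r^2(\hat\theta'S^{-1}\hat\theta)}{(\hat\theta'S^{-1}\hat\theta)^2} \,\middle|\, t\right] = \frac{(N-p)(N-p+2)}{t^2} E\left[\frac{r^2(\hat\theta'S^{-1}\hat\theta)}{\hat\theta'\Sigma^{-1}\hat\theta} \,\middle|\, t\right].$$

Substituting the above expressions in (2.1) completes the proof.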



3 Main Results 

In this section, we demonstrate the minimaxity of the estimator $\delta_r(\hat\theta)$ under some mild conditions on the function $r(\cdot)$. We also give conditions under which $\delta_r(\hat\theta)$ dominates a James-Stein type shrinkage estimator.

Theorem 3.1. Assume in the model (1.1), $e_i \sim \mathcal{E}_p(0, \Sigma, g)$. Then w.r.t. the loss function (1.14), the estimator $\delta_r(\hat\theta)$ is minimax in the sub-class (1.7), provided

(i) $r$ is non-decreasing,

(ii) $r \le \dfrac{2(p-2)}{N(N-p+2)}$.

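Proof: The estimator $\hat\theta$ given by (1.15) is minimax. Therefore, in order to show that $\delta_r(\hat\theta)$ is minimax it is enough to show that $R(\hat\theta; \theta) - R(\delta_r(\hat\theta); \theta) \ge 0$. But from Lemma 2.2 we have $R(\delta_r(\hat\theta); \theta) - R(\hat\theta; \theta) = A + B$, where

$$A = -4(N-p)\int_0^\infty E\left[r'(\hat\theta'S^{-1}\hat\theta) \,\middle|\, t\right] t^{-2}\,\mathcal{W}(dt),$$

$$B = (N-p)\int_0^\infty E\left[\frac{r(\hat\theta'S^{-1}\hat\theta)}{t\,\hat\theta'(t^{-1}\Sigma)^{-1}\hat\theta} \times \left\{N(N-p+2)\, r(\hat\theta'S^{-1}\hat\theta) - 2(p-2)\right\} \,\middle|\, t\right]\mathcal{W}(dt).$$

Since $r(\cdot)$ is non-decreasing, $r'(\hat\theta'S^{-1}\hat\theta) \ge 0$. Also following Srivastava and Bilodeau (1989), we have

$$\int_0^\infty E\left[r'(\hat\theta'S^{-1}\hat\theta) \,\middle|\, t\right] t^{-2}\,\mathcal{W}(dt) = \int_0^\infty E\left[r'(\hat\theta'S^{-1}\hat\theta) \,\middle|\, t\right] t^{-2}\,\mathcal{W}^+(dt) - \int_0^\infty E\left[r'(\hat\theta'S^{-1}\hat\theta) \,\middle|\, t\right] t^{-2}\,\mathcal{W}^-(dt) \ge 0.$$

Therefore $A \le 0$ and, by making use of (1.18), $A$ is finite. Taking $r \le \frac{2(p-2)}{N(N-p+2)}$, $B \le 0$ is achieved for finite $B$, which completes the proof. To demonstrate that $B$ is finite, it is sufficient to show that

$$\text{(i)}\quad \int_0^\infty t^{-1} E\left[\frac{1}{N\hat\theta'(t^{-1}\Sigma)^{-1}\hat\theta} \,\middle|\, t\right]\mathcal{W}(dt) < \infty,$$

$$\text{(ii)}\quad \int_0^\infty t^{-1} E\left[\frac{r^2(\hat\theta'S^{-1}\hat\theta)}{N\hat\theta'(t^{-1}\Sigma)^{-1}\hat\theta} \,\middle|\, t\right]\mathcal{W}(dt) < \infty. \tag{3.1}$$

Note that for a fixed $t$, $N\hat\theta'(t^{-1}\Sigma)^{-1}\hat\theta$ has a non-central chi-square distribution with $p$ d.f. and non-centrality parameter $Nt\,\theta'\Sigma^{-1}\theta$. In conclusion,

$$E\left[\frac{1}{N\hat\theta'(t^{-1}\Sigma)^{-1}\hat\theta} \,\middle|\, t\right] \le E\left[\frac{1}{\chi^2_p}\right] = \frac{1}{p-2}$$

is observed and (3.1)(i) follows from (1.7)(ii)-(iii).

On the other hand, using the covariance inequality (see Lemma 6.6, page 370 of Lehmann and Casella, 1998) and equation (1.19),

$$E\left[\frac{r^2(\hat\theta'S^{-1}\hat\theta)}{N\hat\theta'(t^{-1}\Sigma)^{-1}\hat\theta} \,\middle|\, t\right] \le E\left[r^2(\hat\theta'S^{-1}\hat\theta) \,\middle|\, t\right] E\left[\frac{1}{N\hat\theta'(t^{-1}\Sigma)^{-1}\hat\theta} \,\middle|\, t\right] < \infty.$$

Therefore (3.1)(ii) is satisfied by (1.7)(ii)-(iii). ■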

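In the following, we develop necessary conditions for the shrinkage estimator $\delta_r(\hat\theta)$ to dominate the James-Stein type estimator given by

$$\delta_{JS}(\hat\theta) = \left(1 - \frac{p-2}{N(N-p+2)\,\hat\theta'S^{-1}\hat\theta}\right)\hat\theta. \tag{3.2}$$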



The performance of this estimator is discussed extensively in Srivastava and Bilodeau (1989). The way we derive the necessary conditions is due to Maruyama and Strawderman (2005). Our approach, however, has the following advantages compared to their analysis. (1) We study correlated errors with unknown covariance matrix, while they considered the uncorrelated case. (2) They derived the dominance result for the multivariate normal, and we extend it to ECDs. Although the setting in item (1) is completely different from the uncorrelated one, it is worthwhile to note that their conditions are robust under departures from normality. The following result is the same as Corollary 2.1 of Maruyama and Strawderman (2005). They could also find the class of admissible estimators under normal theory with identity covariance matrix.

Theorem 3.2. Assume that the function $r(\cdot)$ is bounded and absolutely continuous. Necessary conditions for an estimator $\delta_r(\hat\theta)$ to dominate $\delta_{JS}(\hat\theta)$ are that

(i) for every $\omega$, there exists $\omega_0 (> \omega)$ such that $r'(\omega_0) \ge 0$,

(ii) if $\omega r'(\omega)$ has a limiting value as $\omega$ approaches infinity, it must be 0,

(iii) if $r(\omega)$ has a limiting value as $\omega$ approaches infinity and $\omega r'(\omega)$ converges to 0 as $\omega$ approaches infinity, the limit value for $r(\omega)$ must be $\frac{p-2}{N(N-p+2)}$.

Proof: The proof of (i) follows directly from the proof of Corollary 2.1 of Maruyama and Strawderman (2005). Now, using Lemma 2.2, one can directly obtain

$$\Delta = R(\delta_{JS}(\hat\theta); \theta) - R(\delta_r(\hat\theta); \theta) = (N-p)\int_0^\infty E\left[G_r(z'B^{-1}z) \,\middle|\, t\right] t^{-2}\,\mathcal{W}(dt), \tag{3.3}$$

where $z = \Sigma^{-1/2}\bar Y$, $B = \Sigma^{-1/2}S\Sigma^{-1/2}$ and

$$G_r(\omega) = -\frac{[r(\omega) - (p-2)]^2}{(n-p-1)\,\omega} + 4r'(\omega). \tag{3.4}$$



For the proofs of (ii) and (iii), using the proofs of (ii) and (iii) of Corollary 2.1 of Maruyama and Strawderman (2005), it is enough to show that if $\delta_r(\hat\theta)$ dominates $\delta_{JS}(\hat\theta)$, then, for every $\omega$, there exists $\omega_0 (> \omega)$ such that $G_r(\omega_0) \ge 0$. In this case we follow the proof of Theorem 2.1 of Maruyama and Strawderman (2005).

Suppose to the contrary that there exists $\omega_0$ such that $G_r(\omega) < 0$ for any $\omega \ge \omega_0$. Under the boundedness of $G_r(\cdot)$, there exists an $M (> 0)$ such that $G_r(\omega) \le M$ for any $\omega$. Under the assumption of absolute continuity of $G_r(\cdot)$ there exist two points $(\omega_0 <)\ \omega_1 < \omega_2$ and $\epsilon (> 0)$ such that $G_r(\omega) \le -\epsilon$ on $\omega \in [\omega_1, \omega_2]$. Using $M$ and $\epsilon$, we define $G_{r,\epsilon}(\omega)$ as

$$G_{r,\epsilon}(\omega) = \begin{cases} M, & \omega \le \omega_0, \\ 0, & \omega_0 < \omega < \omega_1, \\ -\epsilon, & \omega_1 \le \omega \le \omega_2, \\ 0, & \omega > \omega_2. \end{cases} \tag{3.5}$$



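The inequality $G_{r,\epsilon}(\omega) \ge G_r(\omega)$ for any $\omega$ and equation (3.3) imply

$$(N-p)\int_0^\infty E\left[G_r(z'B^{-1}z) \,\middle|\, t\right] t^{-2}\,\mathcal{W}(dt) \le M(N-p)\int_0^\infty P_\theta\left(W \le \omega_0 \,\middle|\, t\right) t^{-2}\,\mathcal{W}(dt) - \epsilon(N-p)\int_0^\infty P_\theta\left(\omega_1 \le W \le \omega_2 \,\middle|\, t\right) t^{-2}\,\mathcal{W}(dt), \tag{3.6}$$

where $W = \|X\|^2$ for $X = (t^{-1}\Sigma)^{-1/2}\hat\theta$.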

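Based on the properties of the model under study, it can be seen that

$$\int_0^\infty P_\theta\left(W \le \omega_0 \,\middle|\, t\right) t^{-2}\,\mathcal{W}(dt) = \int_0^\infty P_\theta\left(W \le \omega_0 \,\middle|\, t\right) t^{-2}\,\mathcal{W}^+(dt) - \int_0^\infty P_\theta\left(W \le \omega_0 \,\middle|\, t\right) t^{-2}\,\mathcal{W}^-(dt) \ge 0.$$

The same holds for $\int_0^\infty P_\theta\left(\omega_1 \le W \le \omega_2 \,\middle|\, t\right) t^{-2}\,\mathcal{W}(dt)$.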

Now let $a$ be a fixed $p$-dimensional unit vector (see Fig. 1). Then the half space $\{x : a'x \le \sqrt{\omega_0}\}$ includes the $p$-dimensional hyper-ellipsoid $\{x : \|x\|^2 \le \omega_0\}$. For $\theta = (\sqrt{\omega_0} + \lambda)(t^{-1}\Sigma)^{1/2}a$, we have



Pg{W<LJo 



t < 



a'x<,/U^ (27r)2 



< exp(A-ywo) exp — 



exp 



\x-0\\ 



dx 



(27r)5 



exp 



X 



LUQd'x I dx 



< exp(A-ycJo) exp — 



Wo 

T 



(3.7) 



Figure 1: Graph of half planes

For $\mathcal{N} = \{x : \omega_1 \le \|x\|^2 \le \omega_2,\ \sqrt{\omega_1} \le a'x \le \sqrt{\omega_2}\}$ and $\theta = (\sqrt{\omega_0} + \lambda)(t^{-1}\Sigma)^{1/2}a$, we get



P0{uJi<W <UJ2 



t > 



|^-^^|-%xpr-fc^u. 



N (2^)^ 



> exp(AY^aJi) exp ( — 



It-is|-i 



exp 



In (27r)f '-^ V 2 
By making use of the equations (3.6)-(3.8), we can obtain 

1011' 



ujQa'x ) dx. (3.i 



A<{N-p) ciexp(y^X- 

where a = M exp (^) and 






1 — C2 exp 



exp 



{y/uJ^- ^/oj^)\ 



ujaa'x I dx 



r^widt), 



Since $c_1$ and $c_2$ do not depend on $\lambda$, $\Delta$ is negative for sufficiently large $\lambda$. This completes the proof. ■



Next, we give an example of the function $r(\cdot)$. Let

$$r_*(x) = \frac{b(p-2)}{1 + cx^{-1}}, \qquad 0 < b \le \frac{2}{N(N-p+2)} \quad \text{and} \quad c \in \mathbb{R}_+. \tag{3.9}$$
Then we have

$$0 < r_*(x) \le \frac{2(p-2)}{N(N-p+2)}, \qquad Dr_*(x) = \frac{b(p-2)c}{(x+c)^2} > 0,$$

$$\lim_{x\to\infty} r_*(x) = b(p-2), \qquad \lim_{x\to\infty} xDr_*(x) = 0,$$

which satisfy the conditions of Theorems 3.1 and 3.2.
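The properties claimed for this example can be checked numerically. The sketch below assumes the form $r_*(x) = b(p-2)/(1 + c/x)$ with derivative $b(p-2)c/(x+c)^2$, and uses hypothetical values $p = 5$, $N = 12$, $c = 1$ with $b = 2/(N(N-p+2))$; the function name `make_r_star` and the evaluation grid are illustrative only. It verifies monotonicity, the upper bound in condition (ii) of Theorem 3.1, and the two limits on a finite grid, which is a sanity check rather than a proof.

```python
def make_r_star(b, c, p):
    """Return r*(x) = b (p - 2) / (1 + c / x) and its derivative
    b (p - 2) c / (x + c)^2 (assumed form of the example)."""
    def r(x):
        return b * (p - 2) * x / (x + c)

    def dr(x):
        return b * (p - 2) * c / (x + c) ** 2

    return r, dr

# Hypothetical values: p = 5, N = 12, c = 1.
p, N, c = 5, 12, 1.0
bound = 2.0 * (p - 2) / (N * (N - p + 2))  # upper bound on r in Theorem 3.1 (ii)
b = 2.0 / (N * (N - p + 2))                # largest b keeping r* below the bound
r, dr = make_r_star(b, c, p)

grid = [10.0 ** k for k in range(-3, 7)]
is_nondecreasing = all(dr(x) > 0 for x in grid)     # Dr*(x) > 0 on the grid
is_bounded = all(0 < r(x) <= bound for x in grid)   # 0 < r*(x) <= bound on the grid
limit_gap = abs(r(1e12) - b * (p - 2))              # r*(x) approaches b (p - 2)
x_dr_at_large_x = 1e12 * dr(1e12)                   # x Dr*(x) approaches 0
print(is_nondecreasing, is_bounded, limit_gap, x_dr_at_large_x)
```

Smaller values of `b` keep the first two checks true while lowering the limiting value $b(p-2)$ of $r_*$.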

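The resulting shrinkage estimator using the function $r_*(\cdot)$ in (3.9) is a generalized type of the Alam and Thompson (1969) estimator, given by

$$\delta_{r_*}(\hat\theta) = \left(1 - \frac{r_*(\hat\theta'S^{-1}\hat\theta)}{\hat\theta'S^{-1}\hat\theta}\right)\hat\theta = \left(1 - \frac{(p-2)b}{\hat\theta'S^{-1}\hat\theta + c}\right)\hat\theta.$$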

Also note that based on (1.18) and (1.19), the required conditions of the Schwartz space, 
for this example, are b{p — 2)'^E{X + c)^'^ < oo and 6^(p — 2)'^E ( wj) < oo, which 

summarizes to the sole condition E I -xt^ ) < cxd. 

4 Conclusions 

In this paper, we studied a broad class of Stein-type estimators which outperform the consistent estimator of the mean of an elliptically contoured model. It is worthwhile to note that the minimaxity conditions are identical to those obtained under normal assumptions. Hence, they are robust with respect to departures from normality. Moreover, the Bayesian perspective does not offer systematic generality over classical approaches when flat prior information is taken. The class of estimators considered in Srivastava and Bilodeau (1989) is broadened into a more general class of shrinkage estimators; as a result, this work extends a series of Brandwein's and Berger's papers. To the best of my knowledge, it is not simple to prove the admissibility of the class of Bayes shrinkage estimators $\delta_r(\hat\theta)$ under elliptical symmetry, and there exists no study in ECDs when the covariance matrix is unknown. But one may demonstrate it by taking the harmonic prior $\|\theta\|^{2-p}$ for $\pi(\theta)$ in (1.9), which is left for further research. In this case, one may follow the work of Maruyama (2004) under the integral representation of elliptical models in (1.4). The work of Fourdrinier et al. (2003) also has some interesting features.